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Abstract: This paper presents an intelligent system allowing handicapped aphasiacs to 
perform basic communication tasks. It has the following three key features: (1) A 6-sensor 
data glove measures the finger gestures of a patient in terms of the bending degrees of his 
fingers. (2) A finger language recognition subsystem recognizes language components 
from the finger gestures. It employs multiple regression analysis to automatically extract 
proper finger features so that the recognition model can be fast and correctly constructed 
by a radial basis function neural network. (3) A coordinate-indexed virtual keyboard allows 
the users to directly access the letters on the keyboard at a practical speed. The system 
serves as a viable tool for natural and affordable communication for handicapped aphasiacs 
through continuous finger language input. 

Keywords: handicapped aphasiacs; data glove; finger gestures; finger language; 
neural network 
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1. Introduction 

Disabled people suffering from severe impairments usually face an acute problem: most common 
interaction modalities are unavailable and their communication capabilities are limited. In fact, it can 
be very frustrating for them to interact with the computer. They require and expect AAC 
(Augmentative and Alternative Communication) systems to partly alleviate physical limitations. In 
general, AAC systems are usually equipped with special access mechanisms, since the users lack the 
fine motor-control required to operate such systems with standard peripheral devices. 

Though an array of modern tools and technologies for AAC has been developed worldwide to assist 
the disabled, they still have some drawbacks, to name a few: 

(a) Most of them are imported and expensive. 

(b) They are not tailored to local socio-cultural contexts. 

(c) The target users lack necessary motor control to operate associated peripherals. 

The objective of this work is to develop an intelligent and easy-to-use communication system 
(including proper hardware and software) that makes it possible for handicapped aphasiacs to perform 
basic communication tasks naturally and affordably. 

1.1. Finger Language 

Sign language is popularly used in the deaf-mute circle, but the vocabularies are only a small set of 
Mandarin and the deaf-mute find it hard to express themselves accurately in sign language. Sign 
language is utterly difficult for handicapped aphasiacs. It involves too many large-scale movements 
including combined expressions of fingers, palms, and even arms. For severely handicapped sufferers, 
this language system is simply too far off in terms of their needs. 

Compared with the well-developed sign language scheme, the finger language approach is still at a 
burgeoning research stage. It originally came from a concept developed at the Southwest Research 
Institute in 1993 [1] in which a mechanical hand was proposed for the general deaf-and-blind. The 
mechanical hand was claimed to be able to convert the English alphabet to proper hand gestures. Based 
on the concept, the Stanford Mechanical Department team led by Dr. Gilden pioneered the 1st 
communicating mechanical hand, DEXTER, which later had evolved from DEXTER-I to DEXTER-IV 
models. The DEXTER series made it possible to convert the 26 letters of the English alphabet to hand 
gestures from keyboards, computers, TV, and telecommunication devices. By their systems, even 
deaf-and-blind people then became capable of 'reading' by touching the gesture-varying mechanical 
hand [2]. 

The idea of gestures for pronunciation was first proposed by Lyon [3]. It allows the user to type 
words or sentences by making the gestures corresponding to the contained syllables. On this basis, the 
Hanyu Pinyin System was proposed as the Chinese phonetic symbol set in China. Researchers in 
China use CAS-gloves, cyber-gloves, or colored gloves to conduct language translation from 30 basic 
hand gestures, designed for pronunciation and named finger-spelling (or manual alphabet or 
dactylology), into the simplified form of Chinese characters, including the English alphabet and four 
special consonants: ZH, CH, SH, and NG [4,5]. However, their studies still use the positions and 
movement orbits in three-dimensional space of both hands. Due to the great differences in culture, 
local language habits, and even pronunciation, their developed systems are not entirely applicable in 
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Taiwan, which uses the traditional more complicated form of Chinese characters instead of the 
simplified form as used in mainland China. In addition, our purpose strongly emphasizes the very 
limited functionality of the handicapped aphasiacs. We need a system that can detect very weak signals 
caused by very small finger movements and successfully decode the meaning. This requirement is very 
different from the traditional finger spelling. 

1.2. Virtual Keyboard 

Efficient text entry in AAC systems is very important in helping the disabled to produce statements 
as easily and quickly as possible during conversation, since they have great difficulty in back-and-forth 
motions. Many approaches have been proposed. For example, the Chord Keyboard can help the 
disabled as long as they hit a large button or stir a chord [6], even when he/she is in motion or on 
bumpy roads. Demasco and McCoy worked on the line-of-gaze technique that tracks eye motion by 
video camera [7]. The patient makes a direct selection by staring somewhere on a large digitizing 
tablet. Manaris, et al. considered a listening keyboard as a speech input interface for patients [8]. There 
is a Japanese text input method for a small 12-button mobile phone keypad [9]. 

Most virtual keyboards are based on the physical layout of a standard keyboard. Before the user can 
select anything on the screen, he needs some physical instruments for a direct interaction. One of the 
most essential devices for accessing a computer is the keyboard. People who have no fine motor 
control can resort to the scanning mechanism associated with the keyboard. The scanning technology 
moves the cursor row by row until the user selects the desired one by a key or a switch. However, it is 
difficult for the handicapped aphasiacs to do this in a timely way. 

1.3. Data Gloves 

Considering that the most natural way to operate a computer is by body gestures, some special 
hardware device that can help the system to extract the meaning of the body gestures is necessary. The 
fact that the finger language designed for disabled aphasiacs involves only very tiny finger movements 
imposes very stringent requirements on both the software and hardware of any prospective systems. In 
this system, we advocate the use as the input device of a data glove that is capable of sensing the 
slightest movement. Although there are already several imported data gloves of the optic fiber sensor 
type, they are overly complicated and costly for the intended finger language applications. For 
example, a data glove from various companies like VPL Research Inc., or 5 DT Inc. may cost from 
US$2,700 to US$895. This fact has motivated the authors' initial attempt to make a reliable but low 
cost (under US$150) data glove. 

As past experience points out, the major problems with most finger systems are two fold. One is 
system reliability and accuracy; the other is the differences in the movements of each individual 
patient. To overcome such problems, besides the data glove, establishing a delicate tiny-movement- 
discerning software system is indispensable. 
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2. The Approach 

To avoid eavesdropping, ancient Chinese businessmen often settled a transaction by performing 
specific gestures inside spacious sleeves. They called it "the universe inside the sleeves", which could 
be dubbed the earliest figure finger language. We borrow from this idea and use figure gestures as the 
basic input symbols to develop a communication system for handicapped aphasiacs. Our intelligent 
communication system involves three key concepts: a data glove to input finger language components, 
a finger language recognition subsystem to recognize the finger language components, and a virtual 
keyboard to produce text from the finger language components. The center concept is finger language 
components. We have defined 12 finger language components in terms of proper finger gestures to 
support the system. 

2.1. Data Glove as the Input Device 

We have manufactured a data glove that uses only light-emitting diodes (LED) and photo-detectors 
(PD) (i.e., without optic fibers, light from LEDs will reach PDs on line of sight) in measuring the 
bending degrees of fingers, as illustrated in Figure la [10]. The developed data glove is coupled with a 
virtual keyboard simulated as a predetermined 15 x 6 matrix keying system, which can display the 
intended message on a screen. Specifically, the x and y coordinates on the virtual keyboard are decided 
by the movements of the index and middle finger, respectively. Once the desired letter is highlighted, 
the user then bends his thumb to give the confirmation signal to complete a letter selection sequence. 
The system described above has two drawbacks and this work proposes two corresponding solutions, 
as explained below. 

(a) The user needs to carefully move the fingers along the axes to select the desired letter and to 
make prompt affirmation response (by bending the thumb) when the cursor moves to the letter he 
intends. This requirement usually involves rather delicate finger movements. It is unrealistic that we 
assume a handicapped aphasiac is capable of this ability. Our solution is to completely discard the 
finger-movement-controlled cursor design. Instead, we allow the user to designate a number by his 
gesture and then use the number to directly move the cursor to the correct position of a letter. In such a 
way, the user can choose a most becoming speed to enter the text without stress. 

(b) It cannot provide a suitable environment for the user to do continuous finger language input as 
discussed in [11]. For continuous input, we need to guarantee that between the inputs of two 
consecutive gestures, we always can successfully distinguish them without mistaking the transition to 
some other gestures. That is, while the fingers are transforming to a new gesture type, we cannot 
produce error judgments at any rate. Our solution is to add an extra sensor in the wrist area as shown in 
Figure lb, which works as a delimiter corresponding to the bending of the wrist to confirm a gesture 
input. Our experience suggests that we could correctly set a standard default value to determine 
whether a hand wrist is bent without learning. 
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Figure 1. (a) Five sensor data glove design, (b) Six sensor data glove design. 
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LED: light emitted diode PD: photodiode 

2.2. Finger Language Recognition Subsystem 

Given the fact that the finger situations for handicapped aphasiacs are far more complicated and 
varying than those of normal people, the traditional dichotomy model distinguishing merely the 
bending and stretching states is inadequate. To fulfill practical needs, gray box modeling is adopted 
instead, in which the concept of fuzzy sets is incorporated to address the issue of meaning 
impreciseness, as is true of facing the ubiquitous vague or imprecise statements in real-world linguistic 
information. Important techniques involved in the recognition of finger language components are 
shown in Figure 2, including feature values calculation, feature selection and radial basis function 
neural network-based classification. The major purpose is that given a finger gesture, this subsystem 
should determine the features of the gesture and then use them to correctly classify it into one of the 1 1 
classes, which correspond to the 11 language components as we defined in Table 1. 

Figure 2. Techniques used in the system. 



All features 



Feature 
values 
calculation 



Feature 
selection by 
multiple regression 



Text- to- speech 
translation 



Text 
production 



darn glove i 


Specific 
features 




Feature 
values 
calculation 




Radial basis function 
neural network-based 
classification 











Specific 
features 



V irtual keyb oard-sup ported 
text output 



^"Vlodel Application 



Model Building 



Sensors 2010, 10 



379 



A. Feature Values Calculation 



This calculates the corresponding values for the input features. Possible features that may play 
important roles in recognition of finger language are analyzed below along with the method to 
calculate their values: 

a. Of. Original optical signal strength of each finger 
Oi takes on values on the scale of 0 to 255. 

b. Ff. Bending degree of each finger 

Equation 1 below defines F t to be a function of Oi to calculate the bending degree for finger i. 

1 

(0,- a, )/(*,-£*,) 
0 



Oi>b t 
a t <0 { <i 
Oi<a t 



(1) 



where a t and b\ denote the signal values for the least bent and the least straight situations of finger /, 
respectively. The function is illustrated in Figure 3. Table 2 lists the respective ai and bi values for the 
five fingers of User4 (Row 4, Table 3). Take F\ (bending degree function for thumb) as an example. 
Then, at some specific Ou we have a corresponding bending degree F\ 9 as illustrated by Equation 2. 



Table 1. Finger language components needing learning. 
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Fi(0 1 ) = 



1 

(O 1 -80)/(240-80) 
0 



O l > 240 
80 < 0 1 < 240 
O x <80 



(2) 



c. Ni: Normalized value of original optical signal strength of each finger 
Equation 3 below defines Ni in terms of 0\. 

Nt = 0,7255 



(3) 



Table 2. Values a and Z?-parameterized bending degree function. 
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Figure 3. Bending degree function F,. 
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B. Feature Selection 

We use a regression technique to select the features. Suppose an output finger language component 
is represented by a value Y. Taking all the features described above, i.e., Ou Fu and Nu into account, 
then Fcan be mathematically expressed as Equation 4: 

Y = C 0 + (Q x O x + C 2 x 0 2 + C 3 x 0 3 + C 4 x 0 4 + C 5 x 0 5 ) + (C 6 x + C 7 x F 2 + C 8 x F 3 + C 9 x F 4 + C 10 x F 5 ) 
+ (C 11 xiV 1 + C 12 x A^ 2 + C 13 x N 3 + C 14 x yY 4 + C 15 x N 5 ) 



(4) 



where C/'s with / = 0,1,2,. . .,15, are coefficients to be determined. 

We have applied the backward stepwise procedure of multiple regression analysis to the data 
produced by User 4 (Row 4, Table 3). The procedure begins with all the variables assumed to be 
relevant to the response. It proceeds by eliminating one least significant variable at a time (at each 
step) and continues the process until no variables can be removed according to some elimination 
criterion based upon discriminant function analysis [12]. Eventually, it is found that O t and Ni are 
largely statistically insignificant and can be neglected. In Table 3, we also show the customized finger 
features for ten users. In general, the bending degree features (F,-) are most significant; they are 
required for almost every user. Some patients did need other features to obtain a better result, among 
which N2 and are most significant. 

With this, the above formula for a specific user can be secured as Equation 5: 

y = C 0 + C 6 xF 1 + C 7 xF 2 + C 8 xF 3 + C 9 xF 4 + C 10 xF 5 (5) 
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Table 3. Feature selection. 





Features 


Goodness of fit (R2) 


T Tqpt1 


_ p c lSJi 1\U 
r l r 5? iX| 2 


0.99 


Userz 


Fi - F 5 , N 2 , N 3 


n qq 


T T O 

User3 


T" 1 T" 1 XT XT XT XT 

Fi -F 5 , Ni, N 2 , N 4 , N 5 


0.99 


User4 


Fi -F 5 


U.99 


User5 


Fi-F 5 ,Ni-N 5 


0.99 


User6 


Fi-F 5 


0.99 


User7 


F1-F5 


0.99 


User8 


Fi-F 5 


0.99 


User9 


F1-F4.N4.N5 


0.99 


UserlO 


F!-F 5 , N 2 ,N 4 , N 5 


0.99 



The goodness of fit (R2) of Equation 5 is calculated to be 0.99, 0 =SS R2 ^ 1, which means using 
this regressed equation can successfully explain 99% of variance of F, that is to say, Equation 5 has 
high goodness of fit. 

C. Radial Basis Function Neural Network-Based Classification 

We are employing the radial basis function neural network (RBFNN) for classification. The 
architecture, illustrated in Figure 4, learns how to classify finger language components according to the 
selected input features. The selected feature values calculated in the above are first input at the input 
layer of the network, and after going through manipulations in the hidden layer neurons, 11 state 
values come out of the output layer, which correctly identify the corresponding components. Our 
implementation tool is MATLAB. 

Figure 4. Radial basis function neural network-based classification. 
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Initially the hidden layer has no neurons; the neurons are then added one by one until the network's 
mean squared error falls below 0.001. In our case, the number of neurons in hidden layer is settled 
to 93 for the user under experiment. 

2.3. Virtual Keyboard-Supported Text Output 

Shown also in Figure 2 is the virtual keyboard that supports the text output. Our virtual keyboard is 
so designed that a handicapped aphasiac can create text from the keyboard on the screen by wearing 
the data glove to directly select the letters on the keyboard. The virtual keyboard is a two dimensional 
array of size 7x7. We use the 12 language components to manipulate the keyboard, including 
Component 0 till Component 10 as illustrated in Table 1 and an extra Component 11 representing wrist 
bending. Specifically, Component 1 up to Component 7 is used to designate the coordinates X and Y 
of a cell on the keyboard to access the letter in the cell. Component 8 is used to erase an incorrect 
gesture input and signal a new gesture input to follow. Component 1 1 works as a delimiter to confirm a 
coordinate input so that another coordinate input can correctly follow. Finally, Components 0 and 9 are 
used to designate input language, i.e., English or Chinese, respectively. The layouts for English and 
Chinese keyboards are respectively shown in Figures 6 and 7. For example, to input English letter E 
(cell (5, 1) on the English keyboard), a sequence of four components are used: Component 5 (X 
coordinate of E), Component 11, Component 1 (Y coordinate of E), and finally Component 11, as 
illustrated in Table 4. By this way, a handicapped aphasiac can correctly enter any sequences of letters 
to compose his text. Moreover, the dwell time of the key input can be set individually to reflect 
personal taste. In a word, the virtual keyboard can successfully extend our previous static finger 
language recognition capability [13] into a continuous finger language recognition system and creates 
an intelligent text input environment for handicapped aphasiacs. 

Note that the English virtual keyboard looks rather different from the traditional layout. The letters 
are arranged according to the alphabetical sequence as shown in Figure 6. This is because our input 
method has nothing to do with the statistics about how often letters are used in composing words and it 

Table 4. Exemplified input sequence for letter E (containing 3 steps/5 gestures). 
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Delimiter Gesture 


Wrist Bent 



Sensors 2010, 10 



383 



It costs the same to input any letter by designating the coordinates X and Y of the corresponding 
cell. The Chinese virtual keyboard is specifically developed for Taiwanese users for they use the 
Taiwan phonetic system to enter words. The keyboard supports the 37 phonetic symbols, the 5 tone 
symbols — short tone (•), tone one (space), tone two (/), tone three (v), and tone four (\), the conjugate 
characters, and the numeric characters. Characters like space, backspace, and carriage return are also 
supported. When a user inputs a phonetic symbol, e.g., ? \ the system will list all the homophones on 
an associated menu, as illustrated in Figure 5. The user then uses components 1 to 9 to choose the 
correct word. Component 10 works for "Page down" by turning to the next page when the 
homophones span more than one page. Component 11 is used to 'Confirm the choice' as before. The 
number keys on the Chinese keyboard are for continuous digit input, for example, telephone number. 

Figure 5. The homophones of an input phonetic symbol. 
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This virtual keyboard design helps a patient enter text without fooling around his hands over the 
keyboard. Instead, he can select a most suitable speed for text input without being forced to make 
delicate movements and prompt responses from AAC facilities. 

Figure 6. English virtual keyboard. 
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Figure 7. Chinese virtual keyboard. 
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3. System Development and Application 

A typical scenario of system development for a specific user is illustrated as follows: 

Step 1: Adjustment of the glove: The user adjusts PDs and LEDs associated with the five fingers, 

respectively, so that for each finger a dynamic range of the grayscale covering 0 to 255 (or, maximally 

achievable) can be secured. 

Step 2: Tailoring of values a and b: The values of a and b are specifically tailored for the user 

by Equation 6, where / = 1, 2, 3, 4, 5 stands for finger i of the user. 

{ai = max(optical intensity when finger i is bent) 
(6) 
bi = min(optical intensity when finger i is straight) 

Step 3: Creation of training data: The user is instructed to make static finger items designating 
numbers 0, 1, 2, to 10 and 100 times each, in an attempt to cover the widest variations in finger signal 
detection. In the end, each user should produce 11 x 100= 1,100 records of data. 

Step 4: Feature selection: The features specifically working best for the users are decided by 
multiple regression analysis applied on the above training data. 

Step 5: Model training: The training data in Step 3 that are related to the selected features are 
singled out as the training data set and fed into the neural network, which eventually learns a radial 
basis function for each hidden neuron. After the stratified tenfold cross validation, the neural network 
finalizes to the best recognition model and is ready for use. 

Table 5 shows the results of stratified tenfold cross validation for the ten users, showing 100% of 
correctness. At this step, we can make sure the customized static finger language recognition function 
works well. Note that the customization is required because we find each handicapped aphasiac has his 
unique finger positions, sizes, shapes, and cross-finger-influence while bending fingers. Now we can 
turn the system into a continuous finger language recognition system with the support of the two 
dimensional on-screen keyboard. 
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Table 5. Results of stratified tenfold cross validation. 
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Now the system can be put in everyday use by the user: 

Step 1: Virtual keyboard adjustment: At this step, the subject is trained to get familiar with the 
manipulation of the data glove plus the on-screen virtual keyboard so that he feels comfortable with 
the controlling of a computer from his AAC system. 

Step 2: Text production: The user puts on the glove and makes gestures; the recognition system 
recognizes the corresponding language components and the virtual keyboard displays the designated 
letters. The letters finally develop into a sentence. 

Step 3: Speech production: The intended sentence is finally sent to the text-to-speech module for 
speech production. 

4. Summary and Discussions 

This paper describes an intelligent communication system that can work as a general mode of 
communication for handicapped aphasiacs in Taiwan. It produces text for communication by 
employing three key techniques: a data glove to input finger language components, a finger language 
recognition subsystem to recognize language components, and a virtual keyboard to produce text from 
language components. A handicapped aphasiac can enter text via the virtual keyboard by wearing a 
data glove to make proper finger gestures and use them to directly select the letters on the keyboard. 

Some interesting features of the system are as follows. First, the system is customizable by allowing 
easy adjustment of the data glove to make it fit to specific handicapped aphasiacs. It employs a 
function, the shape of which is easily adjustable for each finger, to effectively calculate the bending 
degrees of the fingers. It employs multiple regression analysis to automatically extract proper finger 
features so that the recognition model can be fast and correctly constructed by a radial basis function 
neural network. Finally, the coordinate-indexed virtual keyboard works as a continuous text input 
media, allowing the users to access the letters on the keyboard with a most becoming speed. 

The major contribution of the work is that the virtual keyboard can be considered as a valuable 
low-cost enhancement to the data glove to effectively help handicapped aphasiacs. We have demonstrated 
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that efficient and effective manipulation of the virtual keyboard using finger-language-based 
interaction techniques is possible. This will greatly enhance the possibilities for physically-challenged 
users to work, on a more equitable basis, with their able-bodied colleagues with the help of the same 
software products. 

This work has opened a possibility for us to extend our research to the recognition of dynamic 
finger gestures in the future, which allows more real applications of the system. As a matter of fact, the 
virtual keyboard, arranged by the alphabetical order, has been tested and confirmed by several users to 
be suitable for typing letters. Another further extension is to improve the customization process by 
pre-developing a "pseudo-general" finger language recognition system with the support of the 
development of a database that accommodates the finger gestures from comprehensive representative 
handicapped aphasiacs. With this, we then can quickly create a customized system for a specific user 
by performing a slight adjustment to the system. 
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